Unsupervised acoustic model training using multiple seed ASR systems

Authors

Horia Cucu

Andi Buzo

Corneliu Burileanu

Abstract

Unsupervised acoustic modeling can offer a cost- and time-effective way of creating a solid acoustic model for any under-resourced language. This paper explores the novel idea of using two independent ASR systems to transcribe new speech data, aligning and filtering the ASR hypotheses, and using the presumably correct transcriptions to iteratively improve the two seed ASR systems. In parallel, the newly transcribed speech is used to retrain the mainstream ASR system. The methodology leads to relative WER improvements of 5.5% after the first iteration. The experiments use data in the Romanian language.

Index Terms: unsupervised acoustic modeling, speech recognition, unsupervised training, under-resourced languages
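The abstract does not spell out how the two sets of hypotheses are aligned and filtered, so the following is only a minimal sketch of the general idea, under a simplifying assumption: an utterance is kept only when both seed systems produce the same word sequence after case and whitespace normalization. The function names (`select_agreed`, `relative_wer_improvement`) and the dictionary-based data layout are hypothetical and are not taken from the paper, which may well keep partially agreeing segments instead.

```python
from typing import Dict, List


def normalize(text: str) -> List[str]:
    """Lowercase and split on whitespace so trivial differences do not block agreement."""
    return text.lower().split()


def select_agreed(hyps_a: Dict[str, str], hyps_b: Dict[str, str]) -> Dict[str, str]:
    """Keep utterances on which both seed ASR systems output the same words.

    Assumption: agreement is judged per utterance by exact match of the
    normalized word sequences. Empty hypotheses are discarded.
    """
    agreed: Dict[str, str] = {}
    for utt_id, text_a in hyps_a.items():
        text_b = hyps_b.get(utt_id)
        if text_b is None:
            continue  # utterance was not decoded by the second system
        words_a = normalize(text_a)
        if words_a and words_a == normalize(text_b):
            agreed[utt_id] = " ".join(words_a)
    return agreed


def relative_wer_improvement(wer_before: float, wer_after: float) -> float:
    """Relative WER improvement in percent: 100 * (before - after) / before."""
    if wer_before <= 0:
        raise ValueError("wer_before must be positive")
    return 100.0 * (wer_before - wer_after) / wer_before
```

In an iterative setup, the dictionary returned by `select_agreed` would be added to the training set of each seed system before the next decoding pass, and `relative_wer_improvement` would compare the WER measured before and after retraining.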

Similar resources

Réduction des coûts de développement de systèmes de reconnaissance de la parole à grand vocabulaire. (Reducing development costs of large vocabulary speech recognition systems)

One of the outstanding challenges in large vocabulary automatic speech recognition (ASR) is the reduction of development costs required to build a new recognition system or adapt an existing one to a new task, language or dialect. The state-of-the-art ASR systems are based on the principles of the statistical learning paradigm, using information provided by two stochastic models, an acoustic (A...

The use of sense in unsupervised training of acoustic models for ASR systems

In unsupervised training of ASR systems, no annotated data are assumed to exist. Word-level annotations for training audio are generated iteratively using an ASR system. At each iteration a subset of data judged as having the most reliable transcriptions is selected to train the next set of acoustic models. Data selection however remains a difficult problem, particularly when the error rate of ...

Acoustic model selection for recognition of regional accented speech

Accent is cited as an issue for speech recognition systems [1]. Research has shown that accent mismatch between the training and the test data will result in significant accuracy reduction in Automatic Speech Recognition (ASR) systems. Using HMM based ASR trained on a standard English accent, our study shows that the error rates can be up to seven times higher for accented speech, than for stan...

A Supervised Factorial Acoustic Model for Simultaneous Multiparticipant Vocal Activity Detection in Close-talk Microphone Recordings of Meetings

Using automatic speech recognition (ASR) word error rates (WERs) as a metric, the systems in (1) and (3) appear to yield similar performance, in spite of significant additional architectural differences. Systems of type (2) have not been fielded for segmentation for ASR, and therefore cannot be directly compared. Although approaches of type (3) offer a significant advantage, namely the opp...

Speech alignment and recognition experiments for Luxembourgish

Luxembourgish, embedded in a multilingual context on the divide between Romance and Germanic cultures, remains one of Europe’s under-described languages. In this paper, we propose to study acoustic similarities between Luxembourgish and major contact languages (German, French, English) with the help of automatic speech alignment and recognition systems. Experiments were run using monolingual ac...

Publication date: 2014
